Focused crawler method combining ontology and improved Tabu search for meteorological disaster
LIU Jingfa, GU Yaoping, LIU Wenjie
Journal of Computer Applications    2020, 40 (8): 2255-2261.   DOI: 10.11772/j.issn.1001-9081.2019122238
Abstract
To address the tendency of traditional focused crawlers to fall into local optima and their insufficient topic description, a focused crawler method combining Ontology and Improved Tabu Search (On-ITS) was proposed. First, the topic semantic vector was calculated from ontology semantic similarity, and the Web page text feature vector was constructed by weighting text features according to their position in the HyperText Markup Language (HTML) page. Then, the vector space model was used to calculate the topic relevance of Web pages. On this basis, to determine the comprehensive priority of each link, the topic relevance of the link's anchor text and the PageRank (PR) value of the Web page associated with the link were calculated. In addition, to keep the crawler from falling into local optima, an ITS-based focused crawler was designed to optimize the crawling queue. Experimental results on the topics of rainstorm disaster and typhoon disaster show that, under the same environment, the accuracy of On-ITS exceeds that of the comparison algorithms by between 8% and 58%, and that the proposed algorithm also performs well on the other evaluation indicators. The On-ITS focused crawler method can effectively improve the accuracy of domain information acquisition and capture more topic-related Web pages.
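The relevance and link-priority steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the position weights, the topic vector values, the linear combination in `link_priority`, and all function names are assumptions made for illustration, since the abstract does not state them. The tabu search queue optimization is omitted.

```python
import math
from collections import Counter

# Hypothetical position weights; the abstract does not give the actual values.
POSITION_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}


def page_vector(sections):
    """Build a position-weighted term vector from a {position: text} mapping."""
    vec = Counter()
    for position, text in sections.items():
        weight = POSITION_WEIGHTS.get(position, 1.0)
        for term in text.lower().split():
            vec[term] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity of two sparse vectors (vector space model relevance)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_priority(page_rel, anchor_rel, pagerank, weights=(0.4, 0.4, 0.2)):
    """Combine the three signals linearly; the weights are an assumption."""
    a, b, c = weights
    return a * page_rel + b * anchor_rel + c * pagerank


# Hypothetical topic vector standing in for ontology-derived term weights.
topic = {"typhoon": 1.0, "storm": 0.8, "rainfall": 0.6}

page = page_vector({"title": "Typhoon warning", "body": "heavy rainfall expected"})
anchor = page_vector({"body": "typhoon track forecast"})

priority = link_priority(cosine(topic, page), cosine(topic, anchor), pagerank=0.3)
```

In a crawler built along these lines, links would be dequeued in descending order of `priority`; according to the abstract, On-ITS additionally applies an improved tabu search to that queue to avoid local optima.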